Data
Normalization
See prepare.tcga.survival.data function for how to configure parameters.
This uses data packages obtained from ‘github:averissimo/tcga’ that contain FPKM expression levels.
prepare.tcga.survival.data('prad.data.2018.10.11',
                           'primary.solid.tumor',
                           normalization        = 'center',
                           log2.pre.normalize   = TRUE,
                           handle.duplicates    = 'keep_first',
                           coding.genes         = TRUE,
                           subtract.surv.column = '')
## [INFO] Loaded data from TCGA: prad.data.2018.10.11
## [INFO] type of tissue: primary.solid.tumor
## [INFO] observations (individuals): 445 (10 event / 435 censored)
## [INFO] variables (genes): 19850
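As a rough illustration of what the log2.pre.normalize = TRUE and normalization = 'center' options imply, the sketch below applies a log2 transform followed by per-gene mean centering to a small made-up FPKM matrix. The log2(x + 1) pseudocount, the toy values, and the order of operations are assumptions for illustration only; the actual behaviour is defined by prepare.tcga.survival.data.

```r
# Toy FPKM matrix: rows are individuals, columns are genes (values are made up)
fpkm <- matrix(c(0, 3, 15,
                 1, 7, 31,
                 3, 1, 63),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("patient", 1:3),
                               c("geneA", "geneB", "geneC")))

# Assumed pre-normalization: log2 with a pseudocount of 1 so zeros stay finite
log.fpkm <- log2(fpkm + 1)

# Centering: subtract each gene's mean, leaving its variance unchanged
xdata <- scale(log.fpkm, center = TRUE, scale = FALSE)

colMeans(xdata)  # numerically zero for every gene
```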
Network processing
cor/cov method: pearson
trans.fun is a double power used to scale the values; see ?glmSparseNet::heuristicScale or ?glmSparseNet::hubHeuristic.
# see ?glmSparseNet::heuristicScale
trans.fun <- function(x) {
  heuristicScale(x) + 0.2
}
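To illustrate how a transformation shaped like trans.fun might turn network degree into per-gene penalty weights, here is a self-contained sketch. It does not reproduce glmSparseNet's heuristicScale: toy.scale is a hypothetical stand-in (a decreasing exponential) used only so the example runs without the package. Normalizing the degree by its maximum and giving hubs the smaller weight are also assumptions. The degree values are copied from the degree table in this report.

```r
# Degrees for a few genes, copied from the degree table in this report
degree <- c(UBC = 10394, PRDM10 = 6346, TSPO = 5797, FOS = 4010)

# Hypothetical stand-in for heuristicScale(); NOT the package's formula.
# Maps values in [0, 1] to a weight that decays as the value grows.
toy.scale <- function(x) {
  10^(-x)
}

# Same shape as the report's trans.fun: scale, then add a 0.2 offset.
# With this stand-in, the offset keeps every weight above 0.2.
trans.fun <- function(x) {
  toy.scale(x) + 0.2
}

# Assumed normalization: divide by the largest degree to land in [0, 1]
penalty <- trans.fun(degree / max(degree))
round(penalty, 3)
```

A vector like this could, for example, be supplied as the penalty.factor argument of glmnet, so that genes with a larger weight are penalized more strongly.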
| ensembl_gene_id | degree | external_gene_name |
|---|---|---|
| ENSG00000150991 | 10394 | UBC |
| ENSG00000170325 | 6346 | PRDM10 |
| ENSG00000100300 | 5797 | TSPO |
| ENSG00000174775 | 5614 | HRAS |
| ENSG00000142208 | 5573 | AKT1 |
| ENSG00000111640 | 5363 | GAPDH |
| ENSG00000163631 | 5260 | ALB |
| ENSG00000177606 | 4952 | JUN |
| ENSG00000141510 | 4936 | TP53 |
| ENSG00000197122 | 4903 | SRC |
| ENSG00000254647 | 4592 | INS |
| ENSG00000112062 | 4187 | MAPK14 |
| ENSG00000170315 | 4114 | UBB |
| ENSG00000170345 | 4010 | FOS |
## [INFO] Size of sets: (size/events)
## * Train: 80.00% :: 356 / 8
## * Test: 20.00% :: 89 / 2
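The set sizes above are consistent with an 80/20 split drawn separately among events and censored individuals (balanced.sets: TRUE in the parameters). The following base R sketch shows one way such a stratified split can be done; the helper sample.stratum and the simulated status vector are illustrative assumptions, not the report's own code.

```r
set.seed(1985)  # seed value listed in the report's parameters

# Simulated event indicator with the report's counts: 10 events, 435 censored
status <- c(rep(1, 10), rep(0, 435))

# Draw a fixed share of the indices within one stratum
sample.stratum <- function(ix, ratio) {
  ix[sample.int(length(ix), round(ratio * length(ix)))]
}

# Sample 80% of the events and 80% of the censored individuals for training
train.ix <- sort(c(sample.stratum(which(status == 1), 0.8),
                   sample.stratum(which(status == 0), 0.8)))
test.ix  <- setdiff(seq_along(status), train.ix)

c(train = length(train.ix), train.events = sum(status[train.ix]),
  test  = length(test.ix),  test.events  = sum(status[test.ix]))
```

Because the sampling is done per stratum, the counts come out as 356 / 8 for training and 89 / 2 for testing regardless of the seed; only the identity of the selected individuals changes.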
## [INFO] Number of variables per model:
| Model | BaseModel | Alpha | TargetVars | nvars |
|---|---|---|---|---|
| Elastic Net | Elastic Net | 0.60 | 3 | 4 |
| Hub | Elastic Net | 0.60 | 3 | 3 |
| Orphan | Elastic Net | 0.60 | 3 | 3 |
| Elastic Net | Hub | 0.10 | 13 | 14 |
| Hub | Hub | 0.10 | 13 | 13 |
| Orphan | Hub | 0.10 | 13 | 14 |
| Elastic Net | Orphan | 0.10 | 5 | 6 |
| Hub | Orphan | 0.10 | 5 | 5 |
| Orphan | Orphan | 0.10 | 5 | 5 |
## [INFO] note, selected variables could be slightly different from target, to have more accuracy increase nlambda in code
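Calculated using the inferred models and the train/test datasets. For the C-index columns, higher is better; the km.train and km.test columns appear to be Log-rank p-values, for which lower is better.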
| weighted | penalization | project | tissue | cutoff | coding.genes | alpha | model | nvars | km.train | km.test | c.index.train | c.index.test |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FALSE | string | prad.data.2018.10.11 | primary.solid.tumor | 0 | TRUE | 0.60 | Elastic Net | 4 | 0.0082306 | 0.1434727 | 0.8864542 | 0.1860465 |
| FALSE | string | prad.data.2018.10.11 | primary.solid.tumor | 0 | TRUE | 0.60 | Hub | 3 | 0.0783487 | 0.9784762 | 0.6932271 | 0.3488372 |
| FALSE | string | prad.data.2018.10.11 | primary.solid.tumor | 0 | TRUE | 0.60 | Orphan | 3 | 0.2083740 | 0.4942273 | 0.8390093 | -1.0000000 |
| FALSE | string | prad.data.2018.10.11 | primary.solid.tumor | 0 | TRUE | 0.10 | Elastic Net | 14 | 0.0039942 | 0.1868517 | 0.9063745 | 0.3023256 |
| FALSE | string | prad.data.2018.10.11 | primary.solid.tumor | 0 | TRUE | 0.10 | Hub | 13 | 0.0292004 | 0.9784762 | 0.7898406 | 0.4651163 |
| FALSE | string | prad.data.2018.10.11 | primary.solid.tumor | 0 | TRUE | 0.10 | Orphan | 14 | 0.0434207 | 0.1100129 | 0.9143426 | 0.7209302 |
| FALSE | string | prad.data.2018.10.11 | primary.solid.tumor | 0 | TRUE | 0.10 | Elastic Net | 6 | 0.0054774 | 0.1868517 | 0.8605578 | 0.2093023 |
| FALSE | string | prad.data.2018.10.11 | primary.solid.tumor | 0 | TRUE | 0.10 | Hub | 5 | 0.0344144 | 0.9596664 | 0.7360558 | 0.3837209 |
| FALSE | string | prad.data.2018.10.11 | primary.solid.tumor | 0 | TRUE | 0.10 | Orphan | 5 | 0.0260847 | 0.5987003 | 0.8624128 | 0.6119403 |
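The c.index columns report a concordance index. As a reminder of what that measures, the sketch below implements a plain Harrell-style concordance in base R: among comparable pairs (the individual with the shorter time had an event), it counts how often the higher risk score belongs to the individual who failed first. This is not the code used to produce the table; the function name c.index and the toy data are illustrative, and pairs with tied times are simply skipped.

```r
# Harrell-style concordance index for right-censored data (illustrative)
#   time:   observed follow-up time
#   status: 1 = event, 0 = censored
#   risk:   predicted risk score (higher means earlier expected event)
c.index <- function(time, status, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    if (status[i] != 1) next          # a pair needs an event at the earlier time
    for (j in seq_len(n)) {
      if (time[j] > time[i]) {        # j was still at risk when i had the event
        comparable <- comparable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5  # tied predictions get half credit
        }
      }
    }
  }
  concordant / comparable             # NaN when there are no comparable pairs
}

time   <- c(1, 2, 3, 4)
status <- c(1, 1, 0, 1)

c.index(time, status, risk = c(4, 3, 2, 1))  # perfectly ordered risks: 1
c.index(time, status, risk = c(1, 2, 3, 4))  # perfectly reversed risks: 0
```

With only two events in the test set, very few pairs are comparable, which is one reason test-set values can swing widely between runs.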
(not running at the moment)
| metric | model | base.model | target.vars | alpha | mean | std | median | min |
|---|---|---|---|---|---|---|---|---|
| C-Index (Test set) | Hub | Elastic Net | 3 | 0.60 | 0.337 | 0.678 | 0.557 | -1 |
| C-Index (Test set) | Hub | Hub | 13 | 0.10 | 0.328 | 0.680 | 0.554 | -1 |
| C-Index (Test set) | Hub | Orphan | 5 | 0.10 | 0.295 | 0.683 | 0.472 | -1 |
| C-Index (Test set) | Elastic Net | Elastic Net | 3 | 0.60 | -0.214 | 0.864 | -1.000 | -1 |
| C-Index (Test set) | Elastic Net | Hub | 13 | 0.10 | 0.542 | 0.344 | 0.619 | -1 |
| C-Index (Test set) | Elastic Net | Orphan | 5 | 0.10 | 0.091 | 0.778 | 0.397 | -1 |
| C-Index (Test set) | Orphan | Elastic Net | 3 | 0.60 | -0.254 | 0.819 | -1.000 | -1 |
| C-Index (Test set) | Orphan | Hub | 13 | 0.10 | 0.570 | 0.358 | 0.655 | -1 |
| C-Index (Test set) | Orphan | Orphan | 5 | 0.10 | 0.377 | 0.524 | 0.554 | -1 |
| Log-rank (Test set) | Hub | Elastic Net | 3 | 0.60 | 0.447 | 0.340 | 0.305 | 0.0143 |
| Log-rank (Test set) | Hub | Hub | 13 | 0.10 | 0.453 | 0.344 | 0.305 | 0.00855 |
| Log-rank (Test set) | Hub | Orphan | 5 | 0.10 | 0.453 | 0.347 | 0.304 | 0.0101 |
| Log-rank (Test set) | Elastic Net | Elastic Net | 3 | 0.60 | 0.439 | 0.264 | 0.406 | 0.000311 |
| Log-rank (Test set) | Elastic Net | Hub | 13 | 0.10 | 0.421 | 0.314 | 0.298 | 0.0126 |
| Log-rank (Test set) | Elastic Net | Orphan | 5 | 0.10 | 0.471 | 0.290 | 0.378 | 0.00701 |
| Log-rank (Test set) | Orphan | Elastic Net | 3 | 0.60 | 0.475 | 0.260 | 0.495 | 0.000608 |
| Log-rank (Test set) | Orphan | Hub | 13 | 0.10 | 0.433 | 0.320 | 0.290 | 0.0244 |
| Log-rank (Test set) | Orphan | Orphan | 5 | 0.10 | 0.464 | 0.325 | 0.317 | 0.0143 |
| base.model | r.squared |
|---|---|
| Elastic Net with target nvars: 3 alpha: 0.60 | 0.0376236 |
| Hub with target nvars: 13 alpha: 0.10 | 0.1037840 |
| Orphan with target nvars: 5 alpha: 0.10 | 0.0666755 |
| base.model | r.squared |
|---|---|
| Elastic Net with target nvars: 3 alpha: 0.60 | 0.6595686 |
| Hub with target nvars: 13 alpha: 0.10 | 0.4239360 |
| Orphan with target nvars: 5 alpha: 0.10 | 0.1549383 |
| base.model | r.squared |
|---|---|
| Elastic Net with target nvars: 3 alpha: 0.60 | 0.0140233 |
| Hub with target nvars: 13 alpha: 0.10 | 0.1039123 |
| Orphan with target nvars: 5 alpha: 0.10 | 0.0916150 |
| metric | model | base.model | target.vars | alpha | mean | std | median | min |
|---|---|---|---|---|---|---|---|---|
| C-Index (Train set) | Hub | Elastic Net | 3 | 0.60 | 0.766 | 0.085 | 0.777 | 0.398 |
| C-Index (Train set) | Hub | Hub | 13 | 0.10 | 0.812 | 0.050 | 0.812 | 0.63 |
| C-Index (Train set) | Hub | Orphan | 5 | 0.10 | 0.700 | 0.066 | 0.705 | 0.445 |
| C-Index (Train set) | Elastic Net | Elastic Net | 3 | 0.60 | 0.764 | 0.074 | 0.781 | 0.472 |
| C-Index (Train set) | Elastic Net | Hub | 13 | 0.10 | 0.878 | 0.040 | 0.878 | 0.741 |
| C-Index (Train set) | Elastic Net | Orphan | 5 | 0.10 | 0.767 | 0.076 | 0.769 | 0.5 |
| C-Index (Train set) | Orphan | Elastic Net | 3 | 0.60 | 0.781 | 0.057 | 0.792 | 0.519 |
| C-Index (Train set) | Orphan | Hub | 13 | 0.10 | 0.867 | 0.048 | 0.877 | 0.651 |
| C-Index (Train set) | Orphan | Orphan | 5 | 0.10 | 0.750 | 0.136 | 0.788 | 0.298 |
| Log-rank (Train set) | Hub | Elastic Net | 3 | 0.60 | 0.066 | 0.072 | 0.038 | 0.000407 |
| Log-rank (Train set) | Hub | Hub | 13 | 0.10 | 0.021 | 0.032 | 0.011 | 0.000271 |
| Log-rank (Train set) | Hub | Orphan | 5 | 0.10 | 0.094 | 0.080 | 0.077 | 6e-04 |
| Log-rank (Train set) | Elastic Net | Elastic Net | 3 | 0.60 | 0.355 | 0.243 | 0.308 | 0.00251 |
| Log-rank (Train set) | Elastic Net | Hub | 13 | 0.10 | 0.025 | 0.030 | 0.012 | 0.000303 |
| Log-rank (Train set) | Elastic Net | Orphan | 5 | 0.10 | 0.284 | 0.268 | 0.193 | 0.000734 |
| Log-rank (Train set) | Orphan | Elastic Net | 3 | 0.60 | 0.358 | 0.240 | 0.306 | 1.54e-13 |
| Log-rank (Train set) | Orphan | Hub | 13 | 0.10 | 0.058 | 0.080 | 0.040 | 0.000671 |
| Log-rank (Train set) | Orphan | Orphan | 5 | 0.10 | 0.197 | 0.208 | 0.098 | 0.000965 |
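The Log-rank rows summarize p-values from comparing the survival of a high-risk group against a low-risk group. The sketch below shows how one such p-value could be computed with the survival package on simulated data. Splitting at the median risk score is an assumption made for illustration, as are the simulated times and the variable names; the report's own grouping rule may differ.

```r
library(survival)

set.seed(1985)
n <- 100

# Simulated data: a higher risk score shortens the expected survival time
risk   <- rnorm(n)
time   <- rexp(n, rate = exp(risk))
status <- rbinom(n, size = 1, prob = 0.7)  # 1 = event, 0 = censored

# Assumed grouping rule: split individuals at the median risk score
group <- factor(ifelse(risk > median(risk), "high", "low"))

# Log-rank test comparing the survival curves of the two groups
fit <- survdiff(Surv(time, status) ~ group)

# With two groups, the test statistic is chi-squared with 1 degree of freedom
p.value <- pchisq(fit$chisq, df = 1, lower.tail = FALSE)
p.value
```

Repeating this over many random train/test splits (ntimes in the parameters) yields a distribution of p-values per model, which is what the mean, std, median and min columns describe.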
Distribution of the Log-rank test, with individuals separated into high- and low-risk groups
| base.model | r.squared |
|---|---|
| Elastic Net with target nvars: 3 alpha: 0.60 | 0.0214725 |
| Hub with target nvars: 13 alpha: 0.10 | 0.0371328 |
| Orphan with target nvars: 5 alpha: 0.10 | 0.0113305 |
| base.model | r.squared |
|---|---|
| Elastic Net with target nvars: 3 alpha: 0.60 | 0.2178867 |
| Hub with target nvars: 13 alpha: 0.10 | 0.0000101 |
| Orphan with target nvars: 5 alpha: 0.10 | 0.0058999 |
| base.model | r.squared |
|---|---|
| Elastic Net with target nvars: 3 alpha: 0.60 | 0.0005798 |
| Hub with target nvars: 13 alpha: 0.10 | 0.0004029 |
| Orphan with target nvars: 5 alpha: 0.10 | 0.0005280 |
i.e. with p-value < 0.05 in the Log-rank test using the test set.
| Gene | Overlap | total |
|---|---|---|
| ENSG00000278674 | 2 | 571 |
| PRR27 | 2 | 571 |
| GAGE2A | 2 | 395 |
| DEFA1 | 2 | 241 |
| UBC | 1 | 218 |
| SRC | 1 | 218 |
| PRDM10 | 1 | 184 |
| INS | 1 | 137 |
| CDK2 | 1 | 125 |
| AKT1 | 1 | 101 |
| UMPS | 1 | 65 |
| TP53 | 1 | 61 |
| RAD51 | 1 | 49 |
| GAPDH | 1 | 48 |
| MYC | 1 | 44 |
| CSN2 | 2 | 38 |
| ASPDH | 2 | 38 |
| FOS | 1 | 36 |
| AC068946.1 | 2 | 33 |
| UBB | 1 | 31 |
ENSG00000278674(571), PRR27(571), GAGE2A(395), DEFA1(241), UBC(218), SRC(218), PRDM10(184), INS(137), CDK2(125), AKT1(101), UMPS(65), TP53(61), RAD51(49), GAPDH(48), MYC(44), CSN2(38), ASPDH(38), FOS(36), AC068946.1(33), UBB(31), CABS1(29), MIA-RAB4B(29), CDK1(28), LIPN(26), PRKCG(26), MAPK3(25), POTEI(25), CACNG3(24), NUTM2F(20), LHX5(19), NRAS(17), PCSK9(16), NKX1-2(14), ELOVL3(14), HSP90AA1(13), CCDC150(12), AC006486.1(11), KRTAP19-3(10), NUTM2F(10), GSK3A(9), HRAS(8), SLC5A10(7), KRTAP19-7(7), LHX5(7), ACTRT2(7), SLC25A47(6), PCSK9(6), CACNG3(5), OR2T35(5), OR2T2(5), FBXO47(5), ITGA2B(5), RPS6KB1(5), OR2T35(5), TPTE(4), SPINK6(3), MTCP1(3), GAS2(3), PCDHA8(3), MAGEC3(2), OR2W5(2), TARM1(2), AC104581.2(2), RHOA(2), PRSS53(2), FGF11(2), OR2T2(2), CDK3(2), MAGEA9B(2), ENSG00000183791(1), CDC42(1), MAPK1(1), MYB(1), THRSP(1), OR4S1(1), KRTAP10-8(1), ASPDH(1), POM121L12(1), NKX1-2(1), KRTAP19-3(1), PRSS51(1), AC104581.2(1), OR1S1(1)
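One plausible way to build a tally like the one above is to keep only the runs whose test-set Log-rank p-value is below 0.05 and count how often each gene was selected in them. The sketch below does this in base R. The three runs and their p-values are hypothetical (only the gene names come from this report), and whether the report's total column is computed exactly this way is an assumption.

```r
# Genes selected in three hypothetical runs (gene names taken from this report)
selected <- list(
  c("DEFA1", "ENSG00000278674", "PRR27"),
  c("ENSG00000278674", "GAGE2A", "PRR27"),
  c("CACNG3", "DEFA1", "LHX5", "OR4S1")
)

# Hypothetical test-set Log-rank p-values, one per run
p.values <- c(0.01, 0.03, 0.40)

# Keep only the runs below the significance threshold, then count genes
significant <- selected[p.values < 0.05]
gene.counts <- sort(table(unlist(significant)), decreasing = TRUE)
gene.counts
```

Genes that only appear in the non-significant third run (CACNG3, LHX5, OR4S1) drop out of the tally.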
[INFO] Coefs. list
[1] "CACNG3, DEFA1, LHX5, OR4S1"
[INFO] Coefs. list
[1] "AKT1, CDK2, PRDM10, SRC, UBC"
[INFO] Coefs. list
[1] "ENSG00000278674, GAGE2A, PRR27"
[INFO] Coefs. list
[1] "DEFA1, ENSG00000278674, PRR27"
[INFO] Coefs. list
[1] "CDK2, FOS, GAPDH, INS, MAPK3, MYC, PRDM10, PRKCG, RAD51, SRC, TP53, UBC, UMPS"
[INFO] Coefs. list
[1] "DEFA1, ENSG00000278674, GAGE2A, PRR27"
## [INFO] balanced.sets: TRUE
## [INFO] calc.params.old: FALSE
## [INFO] coding.genes: TRUE
## [INFO] degree.correlation: pearson
## [INFO] degree.cutoff: 0.000
## [INFO] degree.type: string
## [INFO] degree.unweighted: TRUE
## [INFO] handle.duplicates: keep_first
## [INFO] log2: TRUE
## [INFO] n.cores: 14.000
## [INFO] normalization: center
## [INFO] ntimes: 5000.000
## [INFO] project: prad.data.2018.10.11
## [INFO] seed: 1985.000
## [INFO] subset: Inf
## [INFO] subtract.surv.column: (I do not know how to display this)
## [INFO] target.vars: list(alpha = 0.6, vars = 3), list(alpha = 0.1, vars = 13), list(alpha = 0.1, vars = 5)
## [INFO] tissue: primary.solid.tumor
## [INFO] train: 0.800